Abstract

During the last five years, serial femtosecond crystallography using x-ray laser pulses has developed into 
a powerful technique for determining the atomic structures of protein molecules from micrometer and
sub-micrometer sized crystals. One of the key reasons for this success is the “self-gating” pulse effect, whereby
the x-ray laser pulses do not need to outrun all radiation damage processes. Instead, x-ray induced damage 
terminates the Bragg diffraction prior to the pulse completing its passage through the sample, as if the Bragg 
diffraction was generated by a shorter pulse of equal intensity. As a result, serial femtosecond crystallography 
does not need to be performed with pulses as short as 5-10 fs, as once thought, but can succeed for pulses 
50-100 fs in duration. We show here that a similar gating effect applies to single molecule diffraction with 
respect to spatially uncorrelated damage processes like ionization and ion diffusion. The effect is clearly 
seen in calculations of the diffraction contrast, in which the diffraction of the average structure is calculated separately
from the diffraction due to statistical fluctuations of the structure caused by damage (“damage noise”). Our results
suggest that sub-nanometer single molecule imaging with 30-50 fs pulses, like those produced at currently 
operating facilities, should not yet be ruled out. The theory we present opens up new experimental avenues 
to measure the impact of damage on single particle diffraction, which is needed to test damage models and 
to identify optimal imaging conditions. 


1 Introduction 

X-ray free-electron laser (XFEL) pulses are envisioned to probe the structures of radiation-sensitive samples, like 
biological molecules, by outrunning radiation damage processes [1]. Current facilities, however, produce their
brightest pulses with durations of the order of tens of femtoseconds [2,3], which is sufficient time for ionization to
become widespread and for ions to move several Angstroms [4,5]. In spite of this, the first applications of XFELs
to serial crystallography have been highly successful [6,7]. It turns out that even for longer pulses (~50-100 fs),
Bragg diffraction probes the undamaged structure in the first few femtoseconds of the pulse-sample interaction,
turning off at later times when radiation damage distributes the diffraction signal as a diffuse background. In
this way, XFEL Bragg diffraction is effectively gated by damage because the expected number of photons scattered
to a Bragg peak is equivalent to that produced by a shorter pulse with the same intensity.

Despite the great progress in coherent imaging using XFEL sources, the holy grail - atomic resolution of 
a single (non-crystalline) biomolecule - has not yet been realized. Nevertheless, the potential reward for 
success has kept this pursuit at the forefront of research in XFEL imaging science. One of the limiting factors 
is radiation damage. For non-crystalline samples, diffraction from the undamaged structure is not enhanced
by periodicity and is mixed indistinguishably with the diffraction of a damaged structure. This is seemingly a
major setback for the prospects of developing 3D single particle imaging into a high resolution technique for 
single molecules. For example, Hau-Riege et al. found that radiation damage causes large discrepancies with 
the ideal diffracted intensities, which led them to conclude that pulses must be no more than a few femtoseconds 
long to avoid severe resolution loss. A more recent study with more detailed scattering models reached a similar
conclusion [10]. However, these studies assessed feasibility with metrics inspired by crystallography whose
suitability for single molecule imaging is disputed [11]. Without accounting in detail for the way that structural
information is extracted from single molecule diffraction data, the issue of damage limits for single molecule
imaging remains inconclusive.

One of the most actively pursued routes to single molecule imaging involves measuring thousands of copies 
of a molecule one by one. The resulting data is extremely noisy and the molecular orientations are not known. 
The issue of molecular orientation must be resolved to assemble a 3D dataset, which can be performed by 
several algorithms [12-15]. The hallmark of these methods is that they are able to cope with signals as low as
0.01 photons per Shannon-Nyquist pixel [16]. After the 3D dataset has been assembled, the atomic structure is
recovered via coherent diffractive imaging methods [17].


The crucial information needed to resolve the unknown orientations, and finally the structure, is contained 
in the modulations of diffraction signal arising from interference between different atoms, often called “speckles” 
(see Fig.). Radiation damage changes the structure of the sample dynamically such that the final diffraction
pattern is the sum of the diffraction from many modified structures, each with a different distribution of ions and
ion displacements. It has been shown that averaging the diffraction over different molecular configurations [18]
lowers the speckle contrast relative to the mean scattering intensity within each resolution shell. We expect
radiation damage to cause a similar loss of contrast. Not only is the amplitude of the speckle structure reduced, 
but speckle structure also fluctuates from shot-to-shot due to damage, in addition to the fluctuations due 
to changing orientation and shot-noise. We will use the term “damage noise” to refer to these fluctuations 
of speckle structure due to damage. So far damage noise has not been considered in studies of 3D dataset 
assembly. Here we present calculations of damage noise per diffraction pattern due to spatially uncorrelated 
damage processes, which include ionization and ion diffusion but not the Coulomb explosion of the molecule. 
An analysis of damage noise as a function of pulse duration reveals a gating effect in single molecule diffraction, 
whereby long pulses measure an equivalent amount of information about the average structure to shorter pulses 
of the same intensity. Theoretical predictions of damage noise are also the first step to understanding how 
orientation determination and 3D data assembly can be performed with data affected by radiation damage. 

An alternative to alignment via post-processing is to experimentally align isolated gas-phase molecules, e.g. 
via quantum-state-selection methods [19,20]. A great advantage of this approach is that multiple molecules
can be illuminated simultaneously, increasing signal-to-noise and, as supported by the work here, reducing the
impact of damage. These methods have been demonstrated only for small (2,5-diiodo-benzonitrile) molecules
so far [19,20] and extensions to larger molecules are being actively pursued. If the molecules are aligned
experimentally, the self-gating effect still applies. Radiation damage modifies each molecule in the beam uniquely
and stochastically, so that multiple damage scenarios are averaged in a single diffraction measurement in an 
analogous way to crystallography. This increases the signal with respect to damage noise as well as shot noise. 
The self-gating effect ensures that such benefits from using multiple aligned molecules are not lost entirely by 
using x-ray pulses longer than 10 fs. 

Once the 3D data assembly has been performed, damage will still have a residual effect on the resulting 
3D diffraction volume. Damage reduces the contrast in the averaged diffraction volume [11], and depending on
the theoretical perspective, also contributes a background [21]. Promisingly, the reduction in contrast can be
accounted for during structure determination by treating the sample in terms of a small number of structural
modes [11]. The background contribution is expected to be small for hard X-rays at beam conditions currently
available.

In addition to analysing the damage noise, we show how the mean and standard deviation of the diffraction 
signal can be combined into a sensitive measure of damage. An advantage of the measure we propose is 
its sensitivity to both ionization and ion motion, whereas the mean signal alone depends only on ionization. 
There is a need to measure damage experimentally and provide some validation and clarification for theoretical 



valuable feedback on our theoretical understanding of the interaction between XFEL pulses and biomolecules, 
which is needed to develop single molecule imaging techniques. 


2 The effect of radiation damage on diffraction contrast 


The goal of single molecule imaging is to recover the initial position R of each atom in the sample. For simplicity, 
we will give equations for the case of a single atomic species, noting that the generalization to multiple atomic 
species is similar to that found in Ref. [11]. The intensity of a single measurement of a single molecule can be
written

$$I(\mathbf{q}) = r_e^2 P(\mathbf{q})\, d\Omega\, I_0\, [\ldots] \qquad (1)$$

where $\mathbf{q}$ is the scattering vector with magnitude q, $d\Omega$ is the solid-angle term, $r_e$ is the classical electron radius,
N is the number of atoms and $P(\mathbf{q})$ is a polarization term that will be ignored in this discussion. To simplify
mathematical notation, we assume the incident intensity takes a uniform value $I_0$ for the duration of the pulse.
We have defined 
where $\boldsymbol{\epsilon}_i(t)$ is the displacement of the atom from its initial position and T is the duration of the pulse. For
a single two-dimensional measurement, it is understood that $\mathbf{q}$ is sampled at points on the Ewald sphere, but
in general we take $\mathbf{q}$ to be a three-dimensional vector and $I(\mathbf{q})$ to be a three-dimensional function. The
atomic scattering factor f(q,t) depends upon the ionization state of the atom, which changes as a function of
time. The ionic scattering factors can be calculated using Slater orbitals and we use $f_0(q)$ to denote the
atomic scattering factor of the unionized atom. We assume that the probability of an ion having a particular
ionization state at time t is independent of where that atom is located in the sample. Although the ionization
state as a function of time is different for each atom, statistically atoms of the same atomic species are assumed
to be equivalent. We write A(q) and B(q) as a function of the magnitude of the scattering vector, q, because
we assume the atomic scattering factors are spherically symmetric.

Consider an ensemble of 2D diffraction measurements, each with a unique damage scenario. For 3D imaging,
the data needs to be assembled into a 3D intensity volume using an algorithm that accounts for the unknown
molecular orientations. The desired solution of the algorithm is an average intensity, where each 2D measurement
is correctly placed according to orientation and the different damage scenarios are averaged. As shown in
Appendix B, the average intensity can be written in the form

$$\langle I(\mathbf{q}) \rangle = r_e^2 P(\mathbf{q})\, d\Omega\, I_0\, [\ldots]$$

where we have
and $\epsilon(t)$ is the root mean square (rms) displacement of an ion as a function of time.

If the analysis is restricted to damage processes that are random and spatially uncorrelated, then we can 
treat the terms $A_i(q)$ and $B_{ij}(q)$ as random variables and study the effect of damage statistically. We also treat
the initial atomic positions as random with a uniform probability distribution, as is done in crystallography
to analyse the statistics of Bragg intensities (Wilson statistics) at high scattering angles ($q > 0.33$ nm$^{-1}$).

Both ionization and ion diffusion can be treated within this framework and, as we will show, are both involved 
in a self-gating pulse effect. Expansion of the molecule by Coulomb forces is not covered by the statistical 
treatment presented here, but is discussed further below. 

The second term on the right-hand side of the average intensity expression is sensitive to the atomic positions and accounts for the
contrast in the average diffraction pattern. We can treat this information as the “signal” we aim to measure.
The contribution each atom makes to the signal is proportional to B(q), which is equal to the standard deviation
of the diffraction in the merged 3D dataset divided by the number of atoms. The mean shot noise level, denoted
by $\sigma_N$, is proportional to the square root of the intensity. We can estimate the mean shot noise level by
considering the mean diffracted intensity in a shell of constant q, which can be derived by integrating the average intensity
and is proportional to A(q). When the signal is compared to the noise, the proportionality constants have no
influence on the interpretation, so we drop them for simplicity and write

$$\sigma_N^2(q) = A(q). \qquad (8)$$

In addition to shot noise, there is the damage noise due to the variations in how the damage manifests in
each measurement. One contribution to the damage noise is the fluctuation of $A_i(q)$, which is characterized
by its standard deviation, denoted $\sigma_A(q)$. The second contribution to damage noise is
the deviation of $B_{ij}(q)$ from the average speckle B(q), which has a standard deviation $\sigma_B(q)$. The term $\sigma_B(q)$
is given by the difference between the standard deviation of the second term on the right-hand side of the single-measurement intensity, Eq. (1),
and the standard deviation of the second term on the right-hand side of the average intensity. In Appendices C and D we
provide derivations of $\sigma_A(q)$ and $\sigma_B(q)$ that give the following results:

^0 Jo 

By comparing the size of the signal to the size of the shot-noise and damage-noise levels, we can gauge how
much information is contained by each measurement about the molecule’s structure. Here we will study how 
the diffraction pattern varies as a function of pulse duration and pulse energy. We propose the following 
signal-to-noise ratio to characterize the diffraction: 


$$\mathrm{SNR}_{D+N}(q) = \frac{N B(q)}{\sqrt{N \sigma_A^2(q) + N^2 \sigma_B^2(q) + N \sigma_N^2(q)}} \qquad (11)$$


It is also interesting to compare the signal to the damage noise directly, ignoring shot-noise, with the following 
ratio: 


$$\mathrm{SNR}_D(q) = \frac{N B(q)}{\sqrt{N \sigma_A^2(q) + N^2 \sigma_B^2(q)}} \qquad (12)$$
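
As a minimal numerical sketch of the two ratios, assuming the denominators of Eqs. (11) and (12) read $N \sigma_A^2(q) + N^2 \sigma_B^2(q)$, plus $N \sigma_N^2(q)$ when shot noise is included, and that $\sigma_N^2(q) = A(q)$ as in Eq. (8). The function names and every input value are hypothetical placeholders for illustration, not outputs of the rate-equations model.

```python
import math


def snr_damage_and_shot(n_atoms, b, sigma_a, sigma_b, a):
    """SNR_{D+N}(q): signal N*B(q) over combined damage and shot noise.

    Shot noise enters through sigma_N^2(q) = A(q), following Eq. (8).
    """
    variance = n_atoms * sigma_a**2 + n_atoms**2 * sigma_b**2 + n_atoms * a
    return n_atoms * b / math.sqrt(variance)


def snr_damage_only(n_atoms, b, sigma_a, sigma_b):
    """SNR_D(q): the same ratio with the shot-noise term dropped."""
    variance = n_atoms * sigma_a**2 + n_atoms**2 * sigma_b**2
    return n_atoms * b / math.sqrt(variance)


if __name__ == "__main__":
    # Placeholder values at a single q, chosen for illustration only.
    n_atoms, a, b, sigma_a, sigma_b = 50_000, 20.0, 15.0, 3.0, 0.05
    print(snr_damage_and_shot(n_atoms, b, sigma_a, sigma_b, a))
    print(snr_damage_only(n_atoms, b, sigma_a, sigma_b))
```

Under this reading, the $N^2 \sigma_B^2(q)$ term dominates the damage noise for large N, so $\mathrm{SNR}_D(q)$ tends to $B(q)/\sigma_B(q)$ and becomes independent of the number of atoms.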


To estimate $\mathrm{SNR}_{D+N}(q)$ and $\mathrm{SNR}_D(q)$, we need to calculate statistical averages of the scattering factor,
e.g. $\langle f(q,t) \rangle$, $\langle f^2(q,t) \rangle$ etc., which in turn depend on the expected number of ions in each ionization state as
a function of time. To calculate B(q) and $\sigma_B(q)$ we also need to know the ion temperature as a function of
time. These parameters can be calculated by many of the damage models reported in the literature so far, like
molecular dynamics models and hydrodynamic (rate-equations) models [5,22-26]. Here we will present
the results of a rate equations model to investigate single-molecule diffraction contrast and to explore the extent
to which there is a self-gating pulse effect in single molecule diffraction.

The term that has not been calculated before is the correlation between the scattering factor at different time
points, e.g. $\langle f(q,t) f(q,t') \rangle$, which is needed to calculate the damage noise levels. To calculate these correlations
we need to know the conditional probability $P(f_n(q,t') \mid f_m(q,t))$, which gives the probability of an ion being found
in ionization state n at time t' given that it was in ionization state m at time t. We have developed a way of
calculating these conditional probabilities, and hence the damage noise. First, the damage simulation is carried
out, generating the populations of ion states at all time points, and the transition rates between ion states are
stored as a function of time. Starting with the mean ion population of state m at time t, the stored transition
rates can be used to generate the fraction of these atoms in ionization state n at all later time points t' > t,
from which the conditional probabilities can be readily inferred.
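
The procedure can be illustrated with a toy three-state system. The sketch below is an assumption-laden illustration, not the simulation code used for the results here: it uses an explicit-Euler time step, hypothetical transition rates (`toy_rates`) and placeholder scattering factors, and all function names are introduced only for this example.

```python
def step_matrix(rates, dt):
    """One explicit-Euler step: T[n][m] is the probability of m -> n within dt.

    rates[m][n] is the transition rate from state m to state n (m != n).
    """
    size = len(rates)
    T = [[0.0] * size for _ in range(size)]
    for m in range(size):
        leaving = 0.0
        for n in range(size):
            if n != m:
                T[n][m] = rates[m][n] * dt
                leaving += T[n][m]
        T[m][m] = 1.0 - leaving
    return T


def apply(T, p):
    """Propagate a population vector p through one stored step."""
    return [sum(T[n][m] * p[m] for m in range(len(p))) for n in range(len(p))]


def populations(steps, p0):
    """Mean populations at every time point, from the stored steps."""
    pops = [list(p0)]
    for T in steps:
        pops.append(apply(T, pops[-1]))
    return pops


def conditional(steps, k, k2, m):
    """P(state n at time index k2 | state m at time index k), for k2 >= k."""
    size = len(steps[0])
    p = [1.0 if n == m else 0.0 for n in range(size)]
    for T in steps[k:k2]:
        p = apply(T, p)
    return p


def correlation(f, pops, steps, k, k2):
    """<f(t) f(t')> = sum_m p_m(t) f_m sum_n P(n, t' | m, t) f_n."""
    total = 0.0
    for m, p_m in enumerate(pops[k]):
        cond = conditional(steps, k, k2, m)
        total += p_m * f[m] * sum(c * f_n for c, f_n in zip(cond, f))
    return total


def toy_rates(k):
    """Hypothetical rates (per fs) that grow with time index k; three states."""
    r01 = 0.02 * (1.0 + 0.1 * k)
    r12 = 0.01 * (1.0 + 0.1 * k)
    return [[0.0, r01, 0.0], [0.0, 0.0, r12], [0.0, 0.0, 0.0]]


if __name__ == "__main__":
    steps = [step_matrix(toy_rates(k), dt=1.0) for k in range(20)]
    pops = populations(steps, [1.0, 0.0, 0.0])
    f = [6.0, 5.0, 4.0]  # placeholder scattering factors per ionization state
    print(correlation(f, pops, steps, 5, 15))
```

Storing the per-step transition probabilities, rather than only the populations, is what allows the propagation to be restarted from a single state m at any intermediate time.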

We use a damage model based on a rate-equations model [22], which is extended to include ion diffusion
using the methods from a non-local thermal equilibrium plasma model [5,26]. The details of the model are
given in Appendix A. As we closely follow the methods of Refs. [5,22], we expect the results and the validity
of our model to be similar. As we will show, there are sufficient physical processes in our model to illustrate the
self-gating pulse effect in single molecule diffraction.

All statistical quantities are given as weighted averages over the light elements (H, C, N, O). Sulphur was
included in the rate-equations model of damage, but was excluded from the average of statistical diffraction
quantities, like A(q), B(q) and $\sigma_B(q)$, because it is computationally intensive. Sulphur has a much larger number
of possible electron configurations, and averages that depend on two time variables [e.g. $\sigma_B(q)$] took too long
to compute for the range of beam conditions we study here. Since there are of the order of 100 sulphur atoms
and light atoms, our main conclusions are not expected to be affected by neglecting the diffraction from
sulphur.

We have set up our simulations using the chemical composition and size of GroEL. This chaperonin molecule
is a candidate for first tests of single molecule imaging because it survives intact in mass spectrometry
experiments [27], which subject the molecule to similar conditions to injection at an XFEL. It is also of sufficient size to
scatter around 10^ photons per diffraction pattern, as shown in Fig.

Simulations were performed at 8 keV photon energy (~0.155 nm wavelength), which provides sufficient resolution
for structural biology and is similar to that demonstrated in simulation studies of single molecule imaging [16].


The principal effects of damage on molecular diffraction can be seen in Fig., which shows a simulation for a
pulse duration of 40 fs, beam intensity of $5 \times 10^{20}$ W cm$^{-2}$ (corresponding to a 2 mJ pulse) and a 100 x 100
nm$^2$ spot size. Without damage A(q) would be equal to $f_0(q)$, but with damage it is reduced, attenuating the
mean intensity by the same amount. The attenuation occurs at all resolutions, but is a greater fraction of the
original signal at lower resolutions. The term B(q) is lower than A(q) because of the effects of ion motion, and
the discrepancy is more pronounced at higher resolution. The deviations between A(q) and B(q) are important
for accurate structure retrieval methods [11]. In this case, the most significant damage noise term $\sigma_B(q)$ is lower
than B(q) across all resolutions, indicating that even for pulse durations as long as 40 fs damage noise does not
exceed the signal from the average molecular structure.
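
The quoted beam parameters can be cross-checked with a few lines of arithmetic. This sketch assumes a pulse that is uniform in time and across the square spot (a simplification consistent with the uniform-intensity assumption made above); the function and constant names are introduced only for this example.

```python
# Pulse parameters quoted in the text.
PULSE_ENERGY_J = 2e-3        # 2 mJ
PULSE_DURATION_S = 40e-15    # 40 fs
SPOT_SIDE_NM = 100.0         # 100 x 100 nm^2 spot
PHOTON_ENERGY_EV = 8000.0    # 8 keV

HC_EV_NM = 1239.841984       # h*c in eV nm
EV_IN_J = 1.602176634e-19    # joules per electronvolt
NM_IN_CM = 1e-7              # centimetres per nanometre


def intensity_w_per_cm2(energy_j, duration_s, side_nm):
    """Mean intensity of a flat-top pulse over a square spot."""
    area_cm2 = (side_nm * NM_IN_CM) ** 2
    return energy_j / duration_s / area_cm2


def wavelength_nm(photon_energy_ev):
    """Photon wavelength from lambda = h*c / E."""
    return HC_EV_NM / photon_energy_ev


def photons_per_pulse(energy_j, photon_energy_ev):
    """Number of photons carried by the pulse."""
    return energy_j / (photon_energy_ev * EV_IN_J)


if __name__ == "__main__":
    print(intensity_w_per_cm2(PULSE_ENERGY_J, PULSE_DURATION_S, SPOT_SIDE_NM))
    print(wavelength_nm(PHOTON_ENERGY_EV))
    print(photons_per_pulse(PULSE_ENERGY_J, PHOTON_ENERGY_EV))
```

With these inputs the intensity comes out at 5 x 10^20 W cm^-2 and the wavelength at about 0.155 nm.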

To illustrate the self-gating pulse effect in single molecule diffraction, we plot B(q) as a function of pulse
duration for a constant photon energy (8 keV) and constant beam intensity ($5 \times 10^{20}$ W cm$^{-2}$). We see in Fig.
(a) that the signal level at 0.15 nm resolution steadily rises until it plateaus at a maximum value at around 20
fs. The signal at lower resolution accumulates for longer pulse times. Interestingly, the noise due to radiation
damage also rises non-linearly, accumulating at a slower rate at longer pulse times. This is because the random
distribution of ions in the sample has smaller variation when the bound electrons are almost entirely depleted
from each ion. The signal-to-noise ratios, shown in Fig. (b), show strikingly that shot noise has a much
greater effect than damage noise. Although $\mathrm{SNR}_D(q)$ improves greatly for short pulses (<5 fs), $\mathrm{SNR}_{D+N}(q)$
is maximized when the signal B(q) reaches its maximum at around 20 fs.

The results are interesting when there is a trade-off experimentally between pulse duration and pulse energy.
For example, the LCLS can produce 2 mJ pulses with pulse durations of 30-50 fs for hard x-rays. Sub
5 fs pulses can be produced by the LCLS using a low charge method or a slotted foil method, but at the
expense of around a factor of ten in pulse energy. Given such a choice, the analysis presented here suggests
that the gain in signal from a longer pulse with higher pulse energy compensates for the increase in damage.
We note, though, that this conclusion only applies to spatially uncorrelated damage processes like ionization and ion
diffusion (not a Coulomb explosion). The figure shows that $\mathrm{SNR}_{D+N}(q)$ and $\mathrm{SNR}_D(q)$ have a weak dependence
on pulse duration at constant pulse energy. This suggests that maximizing pulse energy has a greater influence 
on the success of single molecule imaging than pulse duration with respect to the spatially uncorrelated damage 
mechanisms considered here. 

If multiple molecules were simultaneously aligned and exposed to the x-ray pulse (as described in the 
Introduction), we would still expect a gating effect qualitatively similar to that shown in Fig. However, we 
would expect $\mathrm{SNR}_{D+N}(q)$ and $\mathrm{SNR}_D(q)$ to scale as $\sqrt{N_{mol}}$, where $N_{mol}$ is the average number of molecules
in the beam for each exposure. This is because the signal is proportional to $N_{mol}$, while the standard deviations
of the damage noise and shot noise scale as $\sqrt{N_{mol}}$. This analysis is missing the additional fluctuations due to
the coherent interference between molecules, which have been considered in the context of angular correlation
methods.
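
The square-root scaling can be written out in one line if the $N_{mol}$ molecules are treated as statistically independent and the interference between molecules is neglected. Here S(q) and $\sigma(q)$ stand for the single-molecule signal and the combined single-molecule noise; both symbols are introduced only for this illustration.

```latex
\mathrm{SNR}^{(N_{mol})}(q)
  = \frac{N_{mol}\, S(q)}{\sqrt{N_{mol}\, \sigma^2(q)}}
  = \sqrt{N_{mol}}\; \frac{S(q)}{\sigma(q)}
  = \sqrt{N_{mol}}\; \mathrm{SNR}^{(1)}(q)
```

The signal adds linearly across molecules, whereas independent noise contributions add in variance, which is the origin of the $\sqrt{N_{mol}}$ factor.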


3 A method of measuring damage experimentally


The statistical analysis of diffraction contrast can be used to measure the amount of damage in single molecule
experiments. The average change to the atomic structure factors, characterized by A(q), can be readily
measured by summing diffraction patterns. This provides some information about ionization levels but not ion
motion. There is more information to be gained by analyzing the fluctuations of the diffraction signal. It is
not convenient to measure $\mathrm{SNR}_{D+N}(q)$, because B(q) cannot be measured directly without resolving the issue
of unknown orientations and assembling a 3D dataset, effectively accomplishing a full imaging experiment. An
experimentally simpler proposition, which is independent of the imaging experiment, is to measure the standard
deviation of the signal within each resolution ring, averaged over all of the measured diffraction patterns. The
standard deviation is proportional to $\langle B_{ij}^2(q) \rangle$ and is a measure of the speckle contrast. It will contain both
contributions from the average structure of the sample and the damage noise. Unfortunately it is not clear how
to separate those two contributions experimentally. Nevertheless, the standard deviation is a sensitive measure
of any dynamical change in the sample structure because it will drop relative to the mean scattering signal, as
has been shown for averages of molecular conformation [18]. To isolate the effect of damage-induced structural
change, we create a measure that first subtracts the expected contribution of shot noise, which is equal to
$\mu_{pix}(q)$, and then normalizes by the mean intensity as follows:


$$D(q) = \frac{\sigma_{pix}^2(q) - \mu_{pix}(q)}{\mu_{pix}^2(q)} \qquad (13)$$


where $\mu_{pix}(q)$ is the average intensity at a pixel in resolution ring q averaged over the whole dataset and $\sigma_{pix}(q)$
is the corresponding standard deviation. The mean and standard deviation are calculated from the ensemble of
experimental data of molecules measured individually in random orientations. It is possible to show that


$$D(q) \approx \frac{\langle B_{ij}^2(q) \rangle}{A^2(q)} \qquad (14)$$


where $\langle B_{ij}^2(q) \rangle$ is given in Appendix D. It is possible to show that $0 < D(q) < 1$, because
$\langle f(q,t) f(q,t') \rangle^2 < \langle f^2(q,t) \rangle \langle f^2(q,t') \rangle$. The figure shows D(q) for variations of pulse duration at constant pulse energy (2 mJ). The
large variations at high scattering angle indicate the sensitivity of D(q) to ion motion and inner shell ionization,
thereby providing complementary information to a measurement of A(q). The term D(q) provides a new means
of comparing damage simulations to experiment, and testing the assumptions that underpin damage models for
the single molecule case.
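
To illustrate how the shot-noise subtraction works in practice, the sketch below evaluates $D = (\sigma_{pix}^2 - \mu_{pix}) / \mu_{pix}^2$ on synthetic photon counts for one resolution ring. It is a toy model under stated assumptions, not a damage simulation: the underlying intensity is drawn from a gamma distribution whose shape parameter `n_modes` is a hypothetical stand-in for the number of independent speckle patterns averaged within a measurement, and Poisson counting noise is added on top. For this model the expected value is D = 1/n_modes, independent of the mean photon count.

```python
import math
import random


def poisson(rng, lam):
    """Poisson sample by Knuth's method; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def damage_measure(counts):
    """D = (variance - mean) / mean**2 of the photon counts in one ring."""
    n = len(counts)
    mu = sum(counts) / n
    var = sum((c - mu) ** 2 for c in counts) / n
    return (var - mu) / mu**2


def simulate_ring(n_samples, mean_photons, n_modes, seed=0):
    """Synthetic counts: gamma-distributed intensity followed by Poisson sampling."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_samples):
        intensity = rng.gammavariate(n_modes, mean_photons / n_modes)
        counts.append(poisson(rng, intensity))
    return counts


if __name__ == "__main__":
    for n_modes in (1, 2, 4):
        counts = simulate_ring(50_000, 2.0, n_modes)
        print(n_modes, damage_measure(counts))
```

Because the Poisson variance equals the mean, subtracting $\mu_{pix}$ removes the counting-noise contribution and leaves only the variance of the underlying intensity, which is why the measure remains usable at low photon counts.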

For low diffraction intensities, the dominant error in the calculation of D(q) from experimental data is the
error of $\mu_{pix}(q)$, given by

$$\delta\mu_{pix}(q) = \frac{[\ldots]}{\sqrt{N_{data}}\, \sqrt{M(q)}} \qquad (15)$$


where $N_{data}$ is the number of diffraction patterns recorded. The term M(q) is the number of speckles in
resolution ring q, which is estimated by dividing the circumference of the ring by the expected speckle width
1/d, where d is the width of the molecule. Assuming D(q) is of the order of one, the error in D(q) goes like
$\delta D(q) \approx |\delta\mu_{pix}(q)| / |\mu_{pix}(q)|$. For the test molecule quoted above and 8 keV photon energy, 2 mJ pulse energy,
100 x 100 nm$^2$ spot size at a resolution of $q = 6.67$ nm$^{-1}$, an accuracy of $\delta D(q) = 0.01$ can be achieved with of
the order of 10^ patterns, which is an order of magnitude fewer than the number required to achieve the same
resolution in an imaging experiment [16]. This analysis could be used to gain early feedback about the data
used in an imaging experiment.


4 Discussion


The results presented on damage noise have implications for the feasibility of assembling the 3D
diffraction volume from the ensemble of noisy 2D measurements. The data-assembly algorithms use
information common to different diffraction measurements to resolve unknown information about molecular orientation.
Predicting the level of damage noise in individual 2D diffraction measurements is a first step toward
understanding how damage affects these algorithms. The prediction that $\mathrm{SNR}_D$ is greater than one even for longer pulse
durations (>20 fs) is a preliminary indication that damage noise will not prevent data assembly under conditions
currently available in experiment. This is because the contribution to the diffraction from the average
molecular structure is greater than the shot-to-shot fluctuations of the diffraction, and it is the contribution from the
averaged structure that is used to resolve the problem of unknown molecular orientations. That $\mathrm{SNR}_{D+N}(q)$
is lower than $\mathrm{SNR}_D(q)$ by more than an order of magnitude (see Fig.) shows that shot noise dominates
damage noise. This can be viewed positively because data-assembly algorithms can already cope with very low
signal levels relative to shot noise when assisted by a priori knowledge about the shot noise statistics [12,13]. However, shot noise
applies per pixel and is well understood to be a Poisson process, whereas damage noise applies to features the
size of a speckle and the underlying distribution is hard to predict analytically. Detailed studies of the effects 
of damage on the performance of data assembly algorithms are still required. 

Our study is restricted to spatially uncorrelated damage processes. One significant omission is the
expansion of the molecule due to the large electrostatic forces created by the positively charged molecule and the
redistribution of trapped electrons. Hydrodynamic simulations have predicted that atoms at the surface can
move distances comparable to the molecule’s size on a time-scale of tens of femtoseconds [22], while the interior
of the molecule moves less in the same time frame, because the trapped electrons redistribute to neutralize the
central part of the molecule. The interior atoms will still produce a significant diffraction signal for resolving
unknown orientations and assembling the diffraction data. If the surface atoms have moved significantly, they
will contribute less to the assembled 3D diffraction data than the interior atoms. If the scattering from surface
atoms does prove to be reduced relative to the bulk, it is an outstanding question as to how to account for this during
structure determination, but modal methods for studying diffraction leave options open.

Since damage has been measured in nanocrystallography experiments, it is worth drawing a distinction
between damage in crystals and in single molecules. In a crystal, damage ionizes and displaces ions differently 
in each unit cell, so that the diffraction contains an average over many different damage scenarios. For a single 
molecule, there is only one damage scenario per measurement and hence we expect a bigger standard deviation 
of diffraction of single molecules than of nanocrystals. Additionally, nanocrystals are much larger than single
molecules, so that the rate at which electrons are trapped is different and the time it takes for a photoelectron
to escape is longer. The water that surrounds a nanocrystal injected via a liquid jet [29] also contributes to
the damage in the form of additional photoelectrons and secondary electrons. It is proposed to inject single
molecules via aerosol injection [30], so that they are surrounded by vacuum, because the background water
scattering from a liquid jet would dominate the diffraction from the molecule. For these reasons, damage 
experiments on single molecules, independent of those on crystals, are needed to draw conclusions for single 
molecule imaging. 

At the x-ray energies required to reach atomic resolution (10 keV), Compton scattering becomes another
significant source of background scattering [31]. The background is predicted to depend on the magnitude of
q, and would increase the noise level $\sigma_N$ by adding to the right-hand side of Eq. (8). It has been predicted
that for beam intensities currently available at hard x-ray energies, the Compton background only becomes
significant at resolutions greater than 2 Å [31]. Hence, Compton scattering is not expected to significantly
influence the results presented here.











5 Conclusion 


We have analyzed shot-to-shot damage-noise fluctuations for single molecule diffraction. For spatially
uncorrelated damage processes, there is a clear damage gating effect by which longer pulses measure the same average
diffraction contrast as shorter pulses with the same intensity. The results further suggest that pulse energy is
more important than pulse duration for maximizing signal to noise for these damage processes. In other words,
a pulse 30 fs in duration may be preferable to a sub 5 fs pulse, if the latter has an order of magnitude less pulse
energy. If both 30 fs and 5 fs pulses have the same pulse energy, then the shorter pulse is preferable because damage
is reduced, which may be important for damage processes not considered here like the Coulomb explosion. These
results provide a preliminary indication that the prospects of resolving molecular orientations to assemble a
3D diffraction volume in the presence of damage are favorable with data from current facilities. We have also
proposed a statistical measure of damage that could be applied experimentally to provide valuable feedback for 
modeling XFEL damage to single biological molecules. 


A Description of the rate-equations model 


We use a damage model based on a rate-equations model [22], which is extended to include ion diffusion using
the methods from a non-local thermal equilibrium plasma model [5,26]. Rates of photoionization are taken from
Ref. [32], rates of Auger decay were taken from Ref. and atomic energy levels were taken from Ref. [34].
Secondary impact ionization rates were taken from Refs. [35,36]. Ejected electrons are assumed to be trapped if
their kinetic energy is less than the trapping energy of the ionized molecule [22]. We assume a spherical geometry
for this calculation, and this is the only place geometry is included in the calculation. Both photoelectrons and
some of the Auger electrons have sufficient energy to escape at early times. All of the trapped electrons are
assumed to thermalize on a sub-femtosecond time scale, so that the energy distribution is Maxwell-Boltzmann,
but the mean temperature changes with time. We include all ionization states of each element and the electron
orbitals for each ionization state were modeled using Slater-type orbitals.

There are some minor differences between our model and the published models on which it is based. We
include all the shells for sulfur (in Ref. 22 it was restricted to 8 electrons). This introduces high energy Auger
electrons that are able to escape the molecule under the same conditions as the photoelectrons. We do not
consider ionization due to potential lowering, as is done in Ref. [26]. We also omit the expansion of the molecule
under electrostatic forces in order to focus on the spatially uncorrelated motion that is implicated in the
self-gating pulse effect. The expansion of a protein molecule has been predicted to affect atoms less than one tenth
of the molecule's radius from the surface [22]. These atoms can move several Angstrom during interaction with
the pulse, which will greatly diminish their contribution to the diffraction contrast. The rest of the atoms are
only weakly affected by expansion because the trapped electrons effectively neutralize the core, for which we
would expect better agreement with the theory presented here.
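B Derivation of Eq. [?]

The intensity of a measurement can be written as

$$I(\mathbf{q}) = r_e^2 P(\mathbf{q})\, d\Omega\, I_0 \int_0^T \Bigg[ \sum_{i=1}^{N} f_i^2(q,t) + 2 \sum_{i=1}^{N} \sum_{j=1}^{i-1} f_i(q,t) f_j(q,t) \cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j + \boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] \Bigg] dt , \tag{16}$$

where the definitions of all terms are given in the main text. We can expand the cosine term as:

$$\cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j + \boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] = \cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] - \sin[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \sin[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] . \tag{17}$$

We can further expand the terms that depend upon the displacement as:

$$\cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] = \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_j(t)] + \sin[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \sin[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_j(t)] . \tag{18}$$

The ensemble averages of individual cosine and sine terms over different random displacements are

$$\langle \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \rangle = \int \frac{1}{\sqrt{2\pi}\, \bar{\epsilon}(t)} \cos[2\pi q\, \epsilon_i(t)]\, e^{-\epsilon_i^2(t) / 2 \bar{\epsilon}^2(t)}\, d\epsilon_i = e^{-2\pi^2 q^2 \bar{\epsilon}^2(t)} \tag{19}$$

and

$$\langle \sin[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \rangle = \int \frac{1}{\sqrt{2\pi}\, \bar{\epsilon}(t)} \sin[2\pi q\, \epsilon_i(t)]\, e^{-\epsilon_i^2(t) / 2 \bar{\epsilon}^2(t)}\, d\epsilon_i = 0 , \tag{20}$$

where, inside the integrals, $\epsilon_i(t)$ is the component of the displacement along $\mathbf{q}$ and $\bar{\epsilon}(t)$ is its root-mean-square value at time $t$.

We assume that ionization and atomic motion are statistically independent so that

$$\langle f_i(q,t) f_j(q,t) \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \rangle = \langle f_i(q,t) f_j(q,t) \rangle \langle \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \rangle . \tag{21}$$

We assume that the ionization of different atoms is statistically independent so that

$$\langle f_i(q,t) f_j(q,t) \rangle = \langle f_i(q,t) \rangle \langle f_j(q,t) \rangle \tag{22}$$

if $i \neq j$. We assume that all the atoms of the same element are equivalent statistically, so that averages of
$f_i(q,t)$ and $\boldsymbol{\epsilon}_i(t)$ are independent of $i$. Combining the above results we get

$$\langle f_i(q,t) f_j(q,t) \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)] \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_j(t)] \rangle = \langle f(q,t) \rangle^2 e^{-4\pi^2 q^2 \bar{\epsilon}^2(t)} . \tag{23}$$

Substituting Eq. (23) into Eq. (16) leads to Eq. [?], using the definitions of $A(q)$ and $B(q)$ in Eqs. [?] and [?],
respectively.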
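
As a numerical sanity check of the Gaussian averages used in this appendix, the following Python sketch compares a trapezoid-rule estimate of the ensemble-averaged cosine with the closed form $e^{-2\pi^2 q^2 \bar{\epsilon}^2}$, and confirms that the averaged sine vanishes. The function names, the integration grid and the example values of $q$ and the root-mean-square displacement are illustrative assumptions and are not taken from the calculations reported here.

```python
import math

def gaussian_average(func, rms, n=20001, span=10.0):
    """Trapezoid estimate of <func(eps)> for eps drawn from N(0, rms**2).

    The grid covers +/- span * rms, which is an assumed cutoff.
    """
    lo = -span * rms
    h = 2.0 * span * rms / (n - 1)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * rms)
    total = 0.0
    for k in range(n):
        eps = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * func(eps) * norm * math.exp(-eps * eps / (2.0 * rms * rms))
    return total * h

def mean_cos_closed(q, rms):
    """Closed form of the averaged cosine: exp(-2 pi^2 q^2 rms^2)."""
    return math.exp(-2.0 * math.pi ** 2 * q ** 2 * rms ** 2)

# Hypothetical example values: q in nm^-1, rms displacement in nm.
q, rms = 5.0, 0.02
mean_cos = gaussian_average(lambda e: math.cos(2.0 * math.pi * q * e), rms)
mean_sin = gaussian_average(lambda e: math.sin(2.0 * math.pi * q * e), rms)
print(mean_cos, mean_cos_closed(q, rms), mean_sin)
```

Squaring the averaged cosine gives the damping factor $e^{-4\pi^2 q^2 \bar{\epsilon}^2}$ that multiplies $\langle f(q,t) \rangle^2$ for a pair of independently displaced atoms.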
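C Derivation of the variance of $A_i(q)$: Eq. [?]

The standard deviation of the sum of $A_i(q)$ terms in Eq. [?], denoted by $\sigma_A(q)$, is given by

$$\sigma_A^2(q) = \frac{1}{N} \left[ \left\langle \left( \sum_{i=1}^{N} A_i(q) \right)^2 \right\rangle - \left\langle \sum_{i=1}^{N} A_i(q) \right\rangle^2 \right] \tag{24}$$

with

$$A_i(q) = \int_0^T f_i^2(q,t)\, dt . \tag{25}$$

Equation (24) is scaled by the number of atoms to give the contribution per atom. We ignore the $i$ dependence
when writing $\sigma_A(q)$ because we assume all atoms of the same element are equivalent. Using the assumption
that ionization on different atoms is statistically independent, we can write

$$\left\langle \left( \sum_{i=1}^{N} A_i(q) \right)^2 \right\rangle = \left\langle \sum_{i=1}^{N} \int_0^T f_i^2(q,t)\, dt \sum_{j=1}^{N} \int_0^T f_j^2(q,t')\, dt' \right\rangle$$

$$= \sum_{i=1}^{N} \int_0^T \!\! \int_0^T \langle f_i^2(q,t) f_i^2(q,t') \rangle\, dt\, dt' + \sum_{i=1}^{N} \sum_{j \neq i} \int_0^T \!\! \int_0^T \langle f_i^2(q,t) \rangle \langle f_j^2(q,t') \rangle\, dt\, dt'$$

$$= N \int_0^T \!\! \int_0^T \langle f^2(q,t) f^2(q,t') \rangle\, dt\, dt' + N(N-1) \left[ \int_0^T \langle f^2(q,t) \rangle\, dt \right]^2 . \tag{26}$$

Therefore,

$$\sigma_A^2(q) = \frac{1}{N} \left[ \left\langle \left( \sum_{i=1}^{N} A_i(q) \right)^2 \right\rangle - \left\langle \sum_{i=1}^{N} A_i(q) \right\rangle^2 \right] = \int_0^T \!\! \int_0^T \langle f^2(q,t) f^2(q,t') \rangle\, dt\, dt' - \left[ \int_0^T \langle f^2(q,t) \rangle\, dt \right]^2 . \tag{27}$$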
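
The equivalence between the variance of $A(q)$ and the two-time correlation form in Eq. (27) can be illustrated on sampled trajectories. The Python sketch below is a toy example only: the form-factor model (a single ionization event that lowers $f$ by one at a random time step), the function names and all numerical values are hypothetical and are not the rate-equations model of Appendix A.

```python
import random

def sample_trajectory(n_t, rng, f0=6.0):
    """Toy form factor f(t): one ionization lowers f by 1 at a random step.

    k_ion == n_t means the atom is not ionized during the pulse.
    """
    k_ion = rng.randrange(n_t + 1)
    return [f0 if k < k_ion else f0 - 1.0 for k in range(n_t)]

def sigma_A_sq_from_variance(trajs, dt):
    """Per-atom variance of A = integral of f^2 dt over the sampled shots."""
    a_vals = [sum(f * f for f in tr) * dt for tr in trajs]
    mean_a = sum(a_vals) / len(a_vals)
    return sum((a - mean_a) ** 2 for a in a_vals) / len(a_vals)

def sigma_A_sq_from_correlation(trajs, dt):
    """Double integral of <f^2(t) f^2(t')> minus (integral of <f^2(t)>)^2."""
    n_s, n_t = len(trajs), len(trajs[0])
    sq = [[f * f for f in tr] for tr in trajs]
    mean_sq = [sum(s[k] for s in sq) / n_s for k in range(n_t)]
    double = 0.0
    for k in range(n_t):
        for l in range(n_t):
            double += sum(s[k] * s[l] for s in sq) / n_s
    return double * dt * dt - (sum(mean_sq) * dt) ** 2

rng = random.Random(0)                      # fixed seed for repeatability
trajs = [sample_trajectory(20, rng) for _ in range(200)]
print(sigma_A_sq_from_variance(trajs, 1.0))
print(sigma_A_sq_from_correlation(trajs, 1.0))
```

Both functions return the same number up to rounding, because the $N(N-1)$ cross terms between different atoms cancel against the squared mean, which is why $N$ drops out of the per-atom result.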


D Derivation of the variance of $B_{ij}(q)$: Eq. (10)

The term $\sigma_B(q)$ gauges the magnitude of the damage noise fluctuations per atom due to the second term on
the right-hand side of Eq. [?]. Its square is related to the difference between the variance of the second term
on the right-hand side of Eq. (16) and that of the second term on the right-hand side of Eq. [?], which is given
as follows

$$\sigma_B^2(q) = \frac{1}{N^2} \left[ \sigma_S^2(q) - (N^2 - N) B^2(q) \right] , \tag{28}$$

where $\sigma_S(q)$ is defined to be the standard deviation of the second term on the right-hand side of Eq. (16) and is given by

$$\sigma_S^2(q) = 4 \sum_{i=1}^{N} \sum_{j=1}^{i-1} \sum_{r=1}^{N} \sum_{s=1}^{r-1} \langle B_{ij}(q) B_{rs}(q) \rangle . \tag{29}$$
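The second term on the right-hand side of Eq. (16) contains terms with the form

$$B_{ij}(q) = \int_0^T f_i(q,t) f_j(q,t) \cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j + \boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))]\, dt = B_c(q) \cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] - B_s(q) \sin[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] , \tag{30}$$

where we have defined

$$B_c(q) = \int_0^T f_i(q,t) f_j(q,t) \cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))]\, dt \tag{31}$$

and

$$B_s(q) = \int_0^T f_i(q,t) f_j(q,t) \sin[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))]\, dt . \tag{32}$$

Using Eq. (20) we can show that

$$\langle B_s(q) \rangle = 0 \tag{33}$$

and thus write

$$\langle B(q) \rangle = \langle B_c(q) \rangle . \tag{34}$$

We evaluate $\langle B_{ij}^2(q) \rangle$ as a first step to calculating the standard deviation.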

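$$\langle B_{ij}^2(q) \rangle = \left\langle \left\{ B_c(q) \cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] - B_s(q) \sin[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \right\}^2 \right\rangle$$

$$= \langle B_c^2(q) \rangle \langle \cos^2[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \rangle + \langle B_s^2(q) \rangle \langle \sin^2[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \rangle$$

$$= \frac{1}{2} \left[ \langle B_c^2(q) \rangle + \langle B_s^2(q) \rangle \right] . \tag{35}$$

Going from the first to the second line of Eq. (35), we have used the assumption that the positions of the atoms
are random, so that

$$\langle \cos[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \sin[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \rangle = 0 \tag{36}$$

and, in the last line of Eq. (35), we have

$$\langle \cos^2[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \rangle = \langle \sin^2[2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)] \rangle = \frac{1}{2} . \tag{37}$$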
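
The position averages in Eqs. (36) and (37) can be checked with a short Python sketch. It assumes that random atomic positions make the phase $2\pi \mathbf{q} \cdot (\mathbf{R}_i - \mathbf{R}_j)$ uniformly distributed over a full cycle, and it treats $B_c$ and $B_s$ as fixed numbers; the function names and the example values are illustrative only.

```python
import math

def phase_averages(n=360):
    """Averages of cos^2, sin^2 and cos*sin over n equally spaced phases."""
    phases = [2.0 * math.pi * k / n for k in range(n)]
    c2 = sum(math.cos(p) ** 2 for p in phases) / n
    s2 = sum(math.sin(p) ** 2 for p in phases) / n
    cs = sum(math.cos(p) * math.sin(p) for p in phases) / n
    return c2, s2, cs

def mean_B_squared(b_c, b_s, n=360):
    """Phase average of (b_c*cos(p) - b_s*sin(p))^2 for fixed b_c and b_s."""
    phases = [2.0 * math.pi * k / n for k in range(n)]
    return sum((b_c * math.cos(p) - b_s * math.sin(p)) ** 2 for p in phases) / n

c2, s2, cs = phase_averages()
print(c2, s2, cs)                  # close to 0.5, 0.5 and 0
print(mean_B_squared(3.0, 4.0))    # close to 0.5 * (3^2 + 4^2)
```

The cross term averages to zero, so the relative sign of the cosine and sine terms in $B_{ij}(q)$ does not affect $\langle B_{ij}^2(q) \rangle$.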


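To evaluate Eq. (35), we start by evaluating $\langle B_c^2(q) \rangle$ as follows

$$\langle B_c^2(q) \rangle = \int_0^T \!\! \int_0^T \langle f_i(q,t) f_i(q,t') \rangle \langle f_j(q,t) f_j(q,t') \rangle \left\langle \cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] \cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t') - \boldsymbol{\epsilon}_j(t'))] \right\rangle dt\, dt' . \tag{38}$$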
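Writing $c_i(t) = \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)]$ and $s_i(t) = \sin[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}_i(t)]$, we can write

$$\left\langle \cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] \cos[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t') - \boldsymbol{\epsilon}_j(t'))] \right\rangle$$

$$= \left\langle [c_i(t) c_j(t) + s_i(t) s_j(t)] [c_i(t') c_j(t') + s_i(t') s_j(t')] \right\rangle$$

$$= \langle c_i(t) c_i(t') \rangle \langle c_j(t) c_j(t') \rangle + \langle c_i(t) s_i(t') \rangle \langle c_j(t) s_j(t') \rangle + \langle s_i(t) c_i(t') \rangle \langle s_j(t) c_j(t') \rangle + \langle s_i(t) s_i(t') \rangle \langle s_j(t) s_j(t') \rangle$$

$$= \langle c(t) c(t') \rangle^2 + \langle s(t) s(t') \rangle^2 . \tag{39}$$

In the last line the mixed cosine-sine averages vanish because the displacement distribution is symmetric, and the atom index is dropped because all atoms of the same element are statistically equivalent.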


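The term $\langle c(t) c(t') \rangle$ is given by

$$\langle c(t) c(t') \rangle = \iint \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t)] \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t')]\, P[\epsilon(t), \epsilon(t')]\, d\epsilon(t)\, d\epsilon(t') . \tag{40}$$

The joint probability function is

$$P[\epsilon(t), \epsilon(t')] = P[\epsilon(t) | \epsilon(t')]\, P[\epsilon(t')] . \tag{41}$$

Assume that $t > t'$. We then assume that the conditional probability is the probability of taking a random walk
from position $\epsilon(t')$ at time $t'$ to position $\epsilon(t)$ at time $t$, and takes the form

$$P[\epsilon(t) | \epsilon(t')] = \frac{1}{\sqrt{2\pi}\, \bar{\epsilon}(t,t')} \exp\left[ -\frac{|\epsilon(t) - \epsilon(t')|^2}{2 \bar{\epsilon}^2(t,t')} \right] , \tag{42}$$

where $\bar{\epsilon}(t,t')$ is given by the integral of the diffusion coefficient as a function of time

$$\bar{\epsilon}^2(t,t') = 2 N_d \int_{t'}^{t} d(t'')\, dt'' . \tag{43}$$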


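The term $N_d$ is the number of dimensions, which we will take to be one because we are only interested in
diffusion in the direction of the scattering vector. The diffusion coefficient is given by

$$d(t) = \frac{k_B T(t)}{m \nu(t)} , \tag{44}$$

where $k_B$ is Boltzmann's constant, $T(t)$ is the ion temperature, $m$ is the ion mass and $\nu(t)$ is the collision
frequency. To evaluate Eq. (40), we first write each cosine term as a sum of exponentials

$$\cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t)] = \frac{1}{2} \left[ e^{2\pi i \mathbf{q} \cdot \boldsymbol{\epsilon}(t)} + e^{-2\pi i \mathbf{q} \cdot \boldsymbol{\epsilon}(t)} \right] = \frac{1}{2} \sum_{m=0}^{1} e^{(-1)^m 2\pi i \mathbf{q} \cdot \boldsymbol{\epsilon}(t)} . \tag{45}$$

We then solve two integrals of the form

$$\int_{-\infty}^{\infty} e^{-a x^2 + b x}\, dx = \sqrt{\frac{\pi}{a}}\, e^{b^2 / 4a} . \tag{46}$$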
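
For a tabulated temperature and collision frequency, the diffusion coefficient $d(t) = k_B T(t) / [m \nu(t)]$ and the mean-square displacement, $2 N_d$ times the time integral of $d$ between $t'$ and $t$, can be evaluated numerically. The Python sketch below uses a trapezoid rule on a time grid; the function names, the grid and the example numbers are assumptions made for illustration and do not reproduce the plasma model used here.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def diffusion_coefficient(temperature, mass, collision_freq):
    """d = k_B T / (m nu); SI inputs (K, kg, 1/s) give m^2/s."""
    return K_B * temperature / (mass * collision_freq)

def mean_square_displacement(times, d_values, i_from, i_to, n_dim=1):
    """2 * n_dim * integral of d(t'') from times[i_from] to times[i_to].

    Uses the trapezoid rule on the tabulated d_values. Swapping the limits
    flips the sign, so the result is antisymmetric in its two time arguments.
    """
    sign = 1.0
    if i_to < i_from:
        i_from, i_to, sign = i_to, i_from, -1.0
    total = 0.0
    for k in range(i_from, i_to):
        total += 0.5 * (d_values[k] + d_values[k + 1]) * (times[k + 1] - times[k])
    return sign * 2.0 * n_dim * total

# Hypothetical example: constant d on a uniform grid (arbitrary units).
times = [0.1 * k for k in range(11)]
d_values = [2.0] * 11
print(mean_square_displacement(times, d_values, 2, 7))   # about 2 * 2.0 * 0.5
print(mean_square_displacement(times, d_values, 7, 2))   # same magnitude, negative
```

The antisymmetry under exchange of the two times is the property that later allows both time orderings to be combined with an absolute value in the exponent.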
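The first integral is over $\epsilon(t)$, with $a = \frac{1}{2 \bar{\epsilon}^2(t,t')}$ and $b = \frac{\epsilon(t')}{\bar{\epsilon}^2(t,t')} + (-1)^m 2\pi q i$. The argument of the resulting
exponent is

$$\frac{b^2}{4a} = \frac{\epsilon^2(t')}{2 \bar{\epsilon}^2(t,t')} + (-1)^m 2\pi i q\, \epsilon(t') - 2\pi^2 q^2 \bar{\epsilon}^2(t,t') . \tag{47}$$

The second integral over $\epsilon(t')$ has

$$a = \frac{1}{2 \bar{\epsilon}^2(t,t')} - \frac{1}{2 \bar{\epsilon}^2(t,t')} + \frac{1}{2 \bar{\epsilon}^2(t')} = \frac{1}{2 \bar{\epsilon}^2(t')} ,$$

$$b = (-1)^m 2\pi q i + (-1)^n 2\pi q i ,$$

$$\frac{b^2}{4a} = -2\pi^2 \bar{\epsilon}^2(t') q^2 \left[ (-1)^m + (-1)^n \right]^2 . \tag{48}$$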

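The final summation over $m, n = 0, 1$ gives the following result for $t > t'$:

$$\iint \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t)] \cos[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t')]\, P[\epsilon(t), \epsilon(t')]\, d\epsilon(t)\, d\epsilon(t') = \frac{1}{2} e^{-2\pi^2 q^2 \bar{\epsilon}^2(t,t')} \left[ 1 + e^{-8\pi^2 q^2 \bar{\epsilon}^2(t')} \right] . \tag{49}$$

The corresponding sine integral evaluates to

$$\iint \sin[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t)] \sin[2\pi \mathbf{q} \cdot \boldsymbol{\epsilon}(t')]\, P[\epsilon(t), \epsilon(t')]\, d\epsilon(t)\, d\epsilon(t') = \frac{1}{2} e^{-2\pi^2 q^2 \bar{\epsilon}^2(t,t')} \left[ 1 - e^{-8\pi^2 q^2 \bar{\epsilon}^2(t')} \right] . \tag{50}$$

Adding the squares of the cosine and sine integrals, we get

$$\langle c(t) c(t') \rangle^2 + \langle s(t) s(t') \rangle^2 = \frac{1}{2} e^{-4\pi^2 q^2 \bar{\epsilon}^2(t,t')} \left[ 1 + e^{-16\pi^2 q^2 \bar{\epsilon}^2(t')} \right] \quad (t > t') . \tag{51}$$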
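
The two-time cosine and sine averages for $t > t'$ can be verified numerically for a one-dimensional random walk. In the Python sketch below, the displacement at $t'$ is Gaussian with root-mean-square value `rms_tp` and the additional step taken between $t'$ and $t$ is an independent Gaussian with root-mean-square value `rms_step`; these names, the quadrature grid and the example values are assumptions for illustration only.

```python
import math

def gaussian_grid(rms, n=301, span=8.0):
    """Trapezoid nodes and Gaussian-weighted weights covering +/- span * rms."""
    h = 2.0 * span * rms / (n - 1)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * rms)
    pts = []
    for j in range(n):
        x = -span * rms + j * h
        w = (0.5 if j in (0, n - 1) else 1.0) * h
        pts.append((x, w * norm * math.exp(-x * x / (2.0 * rms * rms))))
    return pts

def two_time_averages(q, rms_tp, rms_step):
    """Numerical <c(t)c(t')> and <s(t)s(t')> with eps(t) = eps(t') + step."""
    k = 2.0 * math.pi * q
    cc = ss = 0.0
    steps = gaussian_grid(rms_step)
    for x, wx in gaussian_grid(rms_tp):
        cx, sx = math.cos(k * x), math.sin(k * x)
        for y, wy in steps:
            cc += wx * wy * math.cos(k * (x + y)) * cx
            ss += wx * wy * math.sin(k * (x + y)) * sx
    return cc, ss

def two_time_closed(q, rms_tp, rms_step):
    """Closed forms: 0.5*exp(-2 pi^2 q^2 s^2) * [1 +/- exp(-8 pi^2 q^2 e^2)]."""
    damp = 0.5 * math.exp(-2.0 * math.pi ** 2 * q ** 2 * rms_step ** 2)
    x = math.exp(-8.0 * math.pi ** 2 * q ** 2 * rms_tp ** 2)
    return damp * (1.0 + x), damp * (1.0 - x)

q, rms_tp, rms_step = 3.0, 0.03, 0.02   # hypothetical values (nm^-1, nm, nm)
cc, ss = two_time_averages(q, rms_tp, rms_step)
print(cc, ss, two_time_closed(q, rms_tp, rms_step))
```

The sum of the squares of the two numerical averages can be compared in the same way with $\frac{1}{2} e^{-4\pi^2 q^2 s^2} [1 + e^{-16\pi^2 q^2 e^2}]$, where $s$ and $e$ stand for `rms_step` and `rms_tp`.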


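To complete the evaluation of Eq. (35), we still need to evaluate $\langle B_s^2(q) \rangle$, which is given by

$$\langle B_s^2(q) \rangle = \int_0^T \!\! \int_0^T \langle f_i(q,t) f_i(q,t') \rangle \langle f_j(q,t) f_j(q,t') \rangle \left\langle \sin[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] \sin[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t') - \boldsymbol{\epsilon}_j(t'))] \right\rangle dt\, dt' . \tag{52}$$

The ensemble average in this equation can be written in the form

$$\left\langle \sin[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t) - \boldsymbol{\epsilon}_j(t))] \sin[2\pi \mathbf{q} \cdot (\boldsymbol{\epsilon}_i(t') - \boldsymbol{\epsilon}_j(t'))] \right\rangle$$

$$= \left\langle [s_i(t) c_j(t) - c_i(t) s_j(t)] [s_i(t') c_j(t') - c_i(t') s_j(t')] \right\rangle$$

$$= 2 \langle c_i(t) c_i(t') \rangle \langle s_j(t) s_j(t') \rangle . \tag{53}$$

Using Eqs. (49) and (50) we can write this as

$$2 \langle c_i(t) c_i(t') \rangle \langle s_j(t) s_j(t') \rangle = \frac{1}{2} e^{-4\pi^2 q^2 \bar{\epsilon}^2(t,t')} \left[ 1 - e^{-16\pi^2 q^2 \bar{\epsilon}^2(t')} \right] \quad (t > t') . \tag{54}$$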
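Combining Eqs. (51) and (54), we can write the time integrals as

$$\langle B_c^2(q) \rangle + \langle B_s^2(q) \rangle = \int_0^T \!\! \int_{t'}^{T} \langle f(q,t) f(q,t') \rangle^2 e^{-4\pi^2 q^2 \bar{\epsilon}^2(t,t')}\, dt\, dt' + \int_0^T \!\! \int_0^{t'} \langle f(q,t) f(q,t') \rangle^2 e^{-4\pi^2 q^2 \bar{\epsilon}^2(t',t)}\, dt\, dt' . \tag{55}$$

Using the property that $\bar{\epsilon}^2(t,t') = -\bar{\epsilon}^2(t',t)$, Eq. (55) can also be written as

$$\langle B_c^2(q) \rangle + \langle B_s^2(q) \rangle = \int_0^T \!\! \int_0^T \langle f(q,t) f(q,t') \rangle^2 e^{-4\pi^2 q^2 |\bar{\epsilon}^2(t,t')|}\, dt\, dt' . \tag{56}$$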


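Using Eqs. (35) and (56) and that $\langle B_{ij} \rangle = 0$, we can calculate the standard deviation of $B_{ij}$ (denoted $\sigma_{B_{ij}}(q)$)
to be

$$\sigma_{B_{ij}}^2(q) = \langle B_{ij}^2(q) \rangle = \frac{1}{2} \int_0^T \!\! \int_0^T \langle f(q,t) f(q,t') \rangle^2 e^{-4\pi^2 q^2 |\bar{\epsilon}^2(t,t')|}\, dt\, dt' . \tag{57}$$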


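We have now reached a point where we can evaluate $\sigma_S(q)$, given by Eq. (29). The averages of terms
$\langle B_{ij}(q) B_{rs}(q) \rangle$ are zero unless $(i, j) = (r, s)$, because the averages over the positions $\mathbf{R}$ equal zero. Therefore,

$$\sigma_S^2(q) = 4 \sum_{i=1}^{N} \sum_{j=1}^{i-1} \langle B_{ij}^2(q) \rangle = 4\, \frac{N^2 - N}{2}\, \langle B_{ij}^2(q) \rangle = (N^2 - N) \int_0^T \!\! \int_0^T \langle f(q,t) f(q,t') \rangle^2 e^{-4\pi^2 q^2 |\bar{\epsilon}^2(t,t')|}\, dt\, dt' . \tag{58}$$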


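Using this result in Eq. (28), we obtain the following result:

$$\sigma_B^2(q) = \frac{1}{N^2} \left[ \sigma_S^2(q) - (N^2 - N) B^2(q) \right] = \left( 1 - \frac{1}{N} \right) \int_0^T \!\! \int_0^T \left[ \langle f(q,t) f(q,t') \rangle^2 e^{-4\pi^2 q^2 |\bar{\epsilon}^2(t,t')|} - \langle f(q,t) \rangle^2 \langle f(q,t') \rangle^2 e^{-4\pi^2 q^2 \bar{\epsilon}^2(t)} e^{-4\pi^2 q^2 \bar{\epsilon}^2(t')} \right] dt\, dt' . \tag{59}$$

Assuming that $N$ is large, the term of order $1/N$ can be ignored.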
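
The large-$N$ limit of this result can be evaluated on a time grid once the form-factor statistics and the mean-square displacement are tabulated. The Python sketch below is a minimal Riemann-sum version under stated assumptions: a uniform grid with spacing `dt`, inputs `mean_f[k]` for $\langle f(q,t_k) \rangle$, `corr_ff[k][l]` for $\langle f(q,t_k) f(q,t_l) \rangle$, and `eps_sq[k]` for the mean-square displacement accumulated from the start of the pulse to $t_k$, so that the two-time quantity is the difference `eps_sq[k] - eps_sq[l]`. All names and example inputs are hypothetical.

```python
import math

def sigma_B_squared(q, dt, mean_f, corr_ff, eps_sq):
    """Riemann-sum estimate of sigma_B^2(q) with the 1/N term dropped."""
    c = 4.0 * math.pi ** 2 * q ** 2
    n_t = len(mean_f)
    total = 0.0
    for k in range(n_t):
        for l in range(n_t):
            # Fluctuating part: <f(t) f(t')>^2 damped by the displacement
            # accumulated between the two times.
            fluct = corr_ff[k][l] ** 2 * math.exp(-c * abs(eps_sq[k] - eps_sq[l]))
            # Mean part: integrand of B(q)^2 written as a double sum.
            mean = (mean_f[k] * mean_f[l]) ** 2 * math.exp(-c * (eps_sq[k] + eps_sq[l]))
            total += fluct - mean
    return total * dt * dt

# Hypothetical inputs: constant form factor, so only diffusion adds noise.
n_t, f0 = 10, 6.0
mean_f = [f0] * n_t
corr_ff = [[f0 * f0] * n_t for _ in range(n_t)]
no_motion = [0.0] * n_t
with_motion = [0.001 * k for k in range(n_t)]
print(sigma_B_squared(3.0, 1.0, mean_f, corr_ff, no_motion))    # no damage
print(sigma_B_squared(3.0, 1.0, mean_f, corr_ff, with_motion))  # diffusion only
```

With no ionization fluctuations and no ion motion the two terms cancel, which is the expected no-damage limit; switching on diffusion alone gives a positive damage-noise contribution.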
Figure 1: A graphical representation of ion diffusion in GroEL, where ion locations are chosen stochastically
using the time-dependent temperature. Simulation parameters are: 8 keV; $5.0 \times 10^{20}$ W cm$^{-2}$; and 100 nm
pulse diameter. Ionized hydrogen (white) moves much faster than ions of other elements. The diffraction pattern
for each time point is shown below and was generated by randomly assigning each atom an ionization state
and a displacement according to a rate-equations model described in Appendix A. Large changes to the speckle
structure are predicted at high resolution, as shown by the enlarged inset regions. The effect of shot-noise is
shown on the right half of each diffraction image.
Figure 2: The effects of damage on the atomic structure factor. The term $f_0(q)$ is the undamaged atomic
scattering factor for an unionized carbon atom, $A(q)$ is proportional to the mean intensity per carbon atom at
each resolution shell, $B(q)$ is proportional to the speckle contrast for carbon and $\sigma_B(q)$ is the standard deviation
of the shot-to-shot fluctuations of the speckle due to damage. When there is no damage $A(q)$ and $B(q)$ are
equal to $f_0^2(q)$. The simulation parameters were 8 keV photon energy, 40 fs pulse duration, 2 mJ pulse energy
and spot size of 100 × 100 nm$^2$.
Figure 3: (a) Scattering and noise levels (due to damage only) as a function of pulse duration for constant
incident intensity ($5 \times 10^{20}$ W cm$^{-2}$) at 8 keV photon energy and 100 × 100 nm$^2$ spot size. $B(q)$ is proportional
to the speckle contrast and we define $N(q) = \sqrt{\sigma_A^2(q)/N + \sigma_B^2(q)}$, which is the denominator in Eq. (12) and
measures the average contribution to the damage noise per atom. (b) Signal-to-noise ratios with and without
shot noise for a resolution of 0.15 nm.
Figure 4: Maximum signal-to-noise ratios with and without shot noise for a resolution of 0.15 nm for 8 keV
photon energy, 100 × 100 nm$^2$ spot size and constant pulse energy of 2 mJ.





Figure 5: The function $D(q)$ for different pulse durations for 8 keV photon energy, 100 × 100 nm$^2$ spot size
and constant pulse energy of 2 mJ.